BACKGROUND OF THE INVENTION
[01] This invention relates in general to digital signal processing and more specifically to 
frequency-domain scaling in discrete cosine transform applications. 

[02] Frequency-domain transforms are valuable operations in many computer processing 
applications. Common transforms include the Fourier Transform (FT), Discrete Fourier 
Transform (DFT), Fast Fourier Transform (FFT), Discrete Cosine Transform (DCT), 
Modified DCT (MDCT), etc. Such transforms, and techniques for implementing the 
transforms, are shown, e.g., in "The Transform and Data Compression Handbook," Rao, K.R. 
and Yip, P.C., CRC Press, New York, 2000; and in "The quick Fourier transform: an FFT 
based on symmetries," Guo, H.; Sitton, G.A. and Burrus, C.S.; IEEE Trans. on Signal 
Processing, Vol. 46, No. 2, 1998. 

[03] The DCT is used extensively in audio and image processing. For example, many 
compression standards such as Joint Photographic Experts Group (JPEG) and Moving 
Picture Experts Group (MPEG) use the DCT at the heart of their processing. 
[04] A problem with implementing frequency-domain transforms in digital processing 
systems is that the operations to compute the transform, and inverse transform, can be 
complex. Often, the accurate calculation of the transform requires performing high-precision 
multiplication and addition on many values. For example, there may be many hundreds of 
thousands, millions, or more, pixels in a single frame of an image. Frames of an image may 
be required to be processed at rates of, e.g., 24, 30 or 60 frames per second. The transforms 
may be required to operate in real time on the frames of pixel values. Thus, any small 
improvement in speed, precision or efficiency of the low-level transform calculations can 
often lead to a tremendous improvement in the overall processing. Such improvements can 
result in a higher-quality image given a fixed set of resources such as processing power and 
memory, in a device such as a consumer electronic playback system for audio and/or video. 
[05] Four traditional types of DCTs are referred to as DCT-I, DCT-II, DCT-III and DCT-IV, 
as defined in the equations below. These four types can be calculated via one another. 
For example, an N-point DCT-II can be decomposed into an N/2-point DCT-I and an N/2-point 
DCT-II; the latter can be further decomposed until no multiplication is needed. In 
image processing a two-dimensional DCT is used. The two-dimensional DCT can be 
obtained by performing multiple one-dimensional DCTs with proper matrix transpositions. 
Other transforms can be achieved with the DCTs shown below. For example, a DFT can 
be implemented by DCT-I. 

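DCT-I: 

$$\mathrm{DCT}(k, N+1, x) = \sum_{n=0}^{N} x(n)\cos\left(\frac{\pi n k}{N}\right), \quad k = 0, 1, 2, \ldots, N \tag{1}$$

DCT-II: 

$$\mathrm{DCT}(k, N, x) = \sum_{n=0}^{N-1} x(n)\cos\left(\frac{\pi (2n+1) k}{2N}\right), \quad k = 0, 1, 2, \ldots, N-1 \tag{2}$$

DCT-III: 

$$\mathrm{DCT}(k, N, x) = \sum_{n=0}^{N-1} x(n)\cos\left(\frac{\pi n (2k+1)}{2N}\right), \quad k = 0, 1, 2, \ldots, N-1 \tag{3}$$

DCT-IV: 

$$\mathrm{DCT}(k, N, x) = \sum_{n=0}^{N-1} x(n)\cos\left(\frac{\pi (2n+1)(2k+1)}{4N}\right), \quad k = 0, 1, 2, \ldots, N-1 \tag{4}$$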

where x(n) is the time-domain sequence. 
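
As a point of reference, Eqns. (1) through (4) can be evaluated directly from their definitions. The following Python sketch does so with plain O(N^2) sums. The function names are illustrative only and are not taken from any library, and no normalization factors are applied because none appear in the equations above.

```python
import math


def dct_i(x):
    """Eqn. (1): len(x) == N + 1 with N >= 1; returns coefficients k = 0..N."""
    big_n = len(x) - 1
    return [
        sum(x[n] * math.cos(math.pi * n * k / big_n) for n in range(big_n + 1))
        for k in range(big_n + 1)
    ]


def dct_ii(x):
    """Eqn. (2): len(x) == N; returns coefficients k = 0..N-1."""
    big_n = len(x)
    return [
        sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * big_n))
            for n in range(big_n))
        for k in range(big_n)
    ]


def dct_iii(x):
    """Eqn. (3): len(x) == N; returns coefficients k = 0..N-1."""
    big_n = len(x)
    return [
        sum(x[n] * math.cos(math.pi * n * (2 * k + 1) / (2 * big_n))
            for n in range(big_n))
        for k in range(big_n)
    ]


def dct_iv(x):
    """Eqn. (4): len(x) == N; returns coefficients k = 0..N-1."""
    big_n = len(x)
    return [
        sum(x[n] * math.cos(math.pi * (2 * n + 1) * (2 * k + 1) / (4 * big_n))
            for n in range(big_n))
        for k in range(big_n)
    ]
```

Direct evaluation of this kind is useful mainly as a correctness reference against which a fast decomposition can be checked.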

[06] The prior art has many different approaches and optimization techniques for DCT 
processing. Fig. 1 shows a typical approach using decomposition. In Fig. 1, flowchart 10 
illustrates basic steps in a process whereby a time-domain signal is decomposed, subjected to 
a 2-point DCT and then composed to produce a full DCT transform. 
[07] Flowchart 10 of Fig. 1 is entered at step 12. At step 14, the time-domain sequence 
x(n) (e.g., a function describing picture element values in an image, audio waveform sample 
values, etc.) is a discrete function of N points. Since it is desirable to perform a DCT on two 
points at a time, the N-point sequence is successively "decomposed" into sequences of length 
Nj until Nj = 2. This is shown by step 18, which performs the decomposition to result in x'(n) 
having a shorter sequence length. A check is made at step 20 to determine whether the decomposed 
sequence is of length 2. If not, the decomposition step is repeated until sequences of length 2 
are attained. 

[08] Next, at step 30, two-point DCTs are performed to yield x'(k), where k represents a 
frequency band and the value of x'(k) represents the magnitude, or power, of the frequency 
band, as is known in the art. Steps 32, 34 and 36 perform a composition of the 2-point DCTs 
to build up the resulting frequency-domain function x(k) to a function, or sequence, of N 
coefficients. Thus, step 34 successively performs a composition function on the results of the 
2-point DCT to yield x'(k) as shown at 32. For each iteration of the composition step, a 
larger set of coefficients, Nj, is attained. At step 36 a check is made as to whether Nj = N. If 
not, composition is continued. 

[09] When Nj = N at step 36, the result is an N-coefficient DCT, x(k), and the transform 
terminates at step 40. 

[10] One drawback of the prior art approaches is that scaling of parameters is typically 
performed in the time domain at decomposition step 18. A factor of the form $\frac{1}{2\cos(\cdot)}$, 
whose argument depends on the time-domain index, is used to scale coefficients in the 
time-domain sequence. Since this factor can take very large values, 
it is difficult to represent the factor, and subsequent operations with the factor, efficiently in 
the binary arithmetic that is used in digital processors. When the word size of a central 
processing unit (CPU) is limited, the handling of small fractional numbers with a high degree 
of precision can require double-width, or larger, representations and slower floating point 
arithmetic operations. Truncating or rounding the values loses precision and degrades the 
accuracy of the transform. The introduction of a large scale factor at steps 16 and 18 of Fig. 1 
also means that the effects of the large number representations are propagated to subsequent 
steps in the calculations. Other effects from time-domain scaling approaches of the prior art 
can include increased memory requirements and larger bandwidth for data transfers within a 
processing system. 
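
The growth of such a factor can be illustrated numerically. The sketch below assumes, purely for illustration, a time-domain factor of the form 1/(2cos((2n+1)π/(2N))); this argument is a hypothetical choice, since the exact prior art expression is not reproduced here. The function names are likewise hypothetical. The sketch reports the largest factor for several transform sizes together with the number of integer bits a fixed-point format would need to hold it.

```python
import math


def hypothetical_time_domain_factors(n_points):
    """Return 1 / (2 cos((2n + 1) pi / (2N))) for n = 0..N/2 - 1.

    The argument (2n + 1) pi / (2N) is an assumed, illustrative form.
    """
    return [
        1.0 / (2.0 * math.cos(math.pi * (2 * n + 1) / (2 * n_points)))
        for n in range(n_points // 2)
    ]


def integer_bits_needed(value):
    """Bits left of the binary point needed to hold |value| (sign bit excluded)."""
    magnitude = abs(value)
    if magnitude < 1.0:
        return 0
    return math.floor(math.log2(magnitude)) + 1


def largest_factor_report(sizes=(8, 64, 1024)):
    """Map each transform size to (largest factor, integer bits needed)."""
    report = {}
    for n_points in sizes:
        largest = max(hypothetical_time_domain_factors(n_points))
        report[n_points] = (largest, integer_bits_needed(largest))
    return report
```

Under this assumption the largest factor grows approximately as N/π, so every additional doubling of N costs roughly one more integer bit in each value that the factor touches.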

SUMMARY OF THE INVENTION 
[11] The present invention uses frequency-domain scaling for DCTs. Scale factors are 
applied to coefficients during the final steps of composition of 2-point DCTs. The number of 
multiplications and required precision are reduced. Fixed values are derived from the known 
length of the time-domain sequence. Some fixed values can be derived independently of the 
length of the time-domain sequence. The approach of the invention can also reduce the 
number of multiplications to compute the transform, and allow smaller bit-width sizes by 
reducing the number of required high-precision calculations. 

[12] In one embodiment the invention provides a method for performing a frequency-domain 
transform on a time-domain signal having a sequence length N, wherein the method 
is executed by a processor, the method comprising: decomposing the time-domain signal into a 
plurality of decomposed signals, wherein each of the plurality of decomposed signals 
has a sequence length less than N; performing a transform on the plurality of 
decomposed signals to obtain a plurality of transformed signals; and composing the plurality of 
transformed signals to obtain a composed signal, including a substep of scaling at least one of the 
transformed signals. 

BRIEF DESCRIPTION OF THE DRAWINGS 
Fig. 1 illustrates a prior art approach to obtaining a discrete cosine transform; 
Fig. 2 illustrates an approach of the invention in obtaining a discrete cosine transform; and 
Fig. 3 shows a flowchart of basic steps to achieve a transform according to a specific 
embodiment of the invention. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 
[13] One aspect of the invention uses scale factor multiplication in a composition phase of 
a system similar to that of Fig. 1 in the prior art. By moving the scale factor multiplication to 
a late step in the transform procedure, the number of high-precision multiplications required 
to implement a DCT can be reduced. In a preferred embodiment, the scale factor 
multiplication is done in the frequency domain at the composition step. Since, for a known 
sequence length N, some of the scale factors can be computed prior to composition this 
allows some of the processing to be done "offline" or in non-real time as opposed to 
performing the computation in real time as, for example, when data is streaming in to an 
encoder. Additionally, some of the scale factor values may be known constants, such as zero, 
so that the computation for the corresponding coefficients can be omitted to further optimize 
processing. 

[14] A mathematical basis for scaling in the frequency domain is shown in the derivation 
in the following sections. The derivation uses standard techniques as will be recognized by 
one of skill in the art. The fundamental mathematical rules, notation and basis upon which 
the derivation is based can be found, generally, in texts such as those cited above to Rao, 
K.R. and Yip, P.C. These texts are hereby incorporated by reference as if set forth in full in 
this document for all purposes. The derivation shows that Equation (1), above, can be 
decomposed into Equation (5), below. 



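$$\mathrm{DCT}(k, N+1, x) = \mathrm{DCT}\left(k, \tfrac{N}{2}+1, x_e\right) + \frac{1}{2\cos\left(\frac{\pi k}{N}\right)}\,\mathrm{DCT}\left(k, \tfrac{N}{2}+1, x_o\right), \quad k = 0, 1, \ldots, \tfrac{N}{2}-1$$

$$\mathrm{DCT}\left(\tfrac{N}{2}, N+1, x\right) = \mathrm{DCT}\left(\tfrac{N}{2}, \tfrac{N}{2}+1, x_e\right) \tag{5}$$

$$\mathrm{DCT}(N-k, N+1, x) = \mathrm{DCT}\left(k, \tfrac{N}{2}+1, x_e\right) - \frac{1}{2\cos\left(\frac{\pi k}{N}\right)}\,\mathrm{DCT}\left(k, \tfrac{N}{2}+1, x_o\right), \quad k = 0, 1, \ldots, \tfrac{N}{2}-1$$

where $x_e$ and $x_o$ are length-$(\tfrac{N}{2}+1)$ sequences defined by 

$$x_e(n) = x(2n), \quad n = 0, 1, \ldots, \tfrac{N}{2}$$

$$x_o(n) = x(2n-1) + x(2n+1), \quad n = 1, 2, \ldots, \tfrac{N}{2}-1 \tag{6}$$

$$x_o(0) = x(1), \qquad x_o\left(\tfrac{N}{2}\right) = x(N-1)$$

The last entry of $x_o$ follows from applying the same sum $x(2n-1) + x(2n+1)$ at $n = \tfrac{N}{2}$ with the out-of-range sample $x(N+1)$ taken as zero. 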
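
The even/odd split of Eqn. (6) and a single decomposition stage of Eqn. (5) can be sketched in Python as follows. This is a minimal illustration, assuming N is even; the function names are hypothetical, and the boundary value x_o(N/2) = x(N-1) is taken from the convention x'(N+1) = 0 that the proof of Eqn. (5) uses.

```python
import math


def dct_i(x):
    """Direct evaluation of Eqn. (1) for a sequence of length N + 1 (N >= 1)."""
    big_n = len(x) - 1
    return [
        sum(x[n] * math.cos(math.pi * n * k / big_n) for n in range(big_n + 1))
        for k in range(big_n + 1)
    ]


def split_even_odd(x):
    """Eqn. (6): split x (length N + 1, N even) into x_e and x_o (length N/2 + 1)."""
    big_n = len(x) - 1
    if big_n < 2 or big_n % 2:
        raise ValueError("len(x) - 1 must be an even number >= 2")
    half = big_n // 2
    x_e = [x[2 * n] for n in range(half + 1)]
    # x_o(0) = x(1); interior terms are pair sums; x_o(N/2) = x(N-1).
    x_o = [x[1]]
    x_o += [x[2 * n - 1] + x[2 * n + 1] for n in range(1, half)]
    x_o += [x[big_n - 1]]
    return x_e, x_o


def dct_i_one_stage(x):
    """One stage of Eqn. (5): combine two half-size DCT-I results."""
    big_n = len(x) - 1
    half = big_n // 2
    x_e, x_o = split_even_odd(x)
    even, odd = dct_i(x_e), dct_i(x_o)
    out = [0.0] * (big_n + 1)
    for k in range(half):
        factor = 1.0 / (2.0 * math.cos(math.pi * k / big_n))
        out[k] = even[k] + factor * odd[k]
        out[big_n - k] = even[k] - factor * odd[k]
    out[half] = even[half]  # the odd part contributes nothing at k = N/2
    return out
```

Comparing `dct_i_one_stage(x)` against `dct_i(x)` for an arbitrary sequence gives a quick numerical check of Eqn. (5); note that the factor used in the loop depends only on the frequency index k.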



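To prove Eqn. (5), we use Eqn. (1) and obtain: 

$$\mathrm{DCT}(k, N+1, x) = \sum_{n=0}^{N} x(n)\cos\left(\frac{\pi n k}{N}\right) = \sum_{n=0}^{N/2} x(2n)\cos\left(\frac{2\pi n k}{N}\right) + \sum_{n=0}^{N/2} x'(2n+1)\cos\left(\frac{\pi (2n+1) k}{N}\right), \quad k = 0, 1, \ldots, N \tag{7}$$

where $x'(2n+1) = x(2n+1)$ for $n = 0, 1, \ldots, \tfrac{N}{2}-1$ and $x'(N+1) = 0$. 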

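Furthermore, for $k = 0, 1, \ldots, N$ and $k \neq \tfrac{N}{2}$, Eqn. (7) becomes 

$$\begin{aligned}
\mathrm{DCT}(k, N+1, x) &= \mathrm{DCT}\left(k, \tfrac{N}{2}+1, x_e\right) + \frac{1}{2\cos\left(\frac{\pi k}{N}\right)} \sum_{n=0}^{N/2} 2x'(2n+1)\cos\left(\frac{\pi (2n+1) k}{N}\right)\cos\left(\frac{\pi k}{N}\right) \\
&= \mathrm{DCT}\left(k, \tfrac{N}{2}+1, x_e\right) + \frac{1}{2\cos\left(\frac{\pi k}{N}\right)} \sum_{n=0}^{N/2} \left[ x'(2n+1)\cos\left(\frac{\pi (2n+2) k}{N}\right) + x'(2n+1)\cos\left(\frac{2\pi n k}{N}\right) \right] \\
&= \mathrm{DCT}\left(k, \tfrac{N}{2}+1, x_e\right) + \frac{1}{2\cos\left(\frac{\pi k}{N}\right)} \left[ \sum_{n=1}^{N/2+1} x'(2n-1)\cos\left(\frac{2\pi n k}{N}\right) + \sum_{n=0}^{N/2} x'(2n+1)\cos\left(\frac{2\pi n k}{N}\right) \right]
\end{aligned} \tag{8}$$

Setting $x'(-1) = 0$ and using $x'(N+1) = 0$, the above equation becomes 

$$\begin{aligned}
\mathrm{DCT}(k, N+1, x) &= \mathrm{DCT}\left(k, \tfrac{N}{2}+1, x_e\right) + \frac{1}{2\cos\left(\frac{\pi k}{N}\right)} \sum_{n=0}^{N/2} \left[ x'(2n-1) + x'(2n+1) \right]\cos\left(\frac{2\pi n k}{N}\right) \\
&= \mathrm{DCT}\left(k, \tfrac{N}{2}+1, x_e\right) + \frac{1}{2\cos\left(\frac{\pi k}{N}\right)} \sum_{n=0}^{N/2} x_o(n)\cos\left(\frac{2\pi n k}{N}\right) \\
&= \mathrm{DCT}\left(k, \tfrac{N}{2}+1, x_e\right) + \frac{1}{2\cos\left(\frac{\pi k}{N}\right)}\,\mathrm{DCT}\left(k, \tfrac{N}{2}+1, x_o\right)
\end{aligned} \quad \left(k = 0, 1, \ldots, N \text{ and } k \neq \tfrac{N}{2}\right) \tag{9}$$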

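For $k = \tfrac{N}{2}$, using Eqn. (1) results in 

$$\mathrm{DCT}\left(\tfrac{N}{2}, N+1, x\right) = \mathrm{DCT}\left(\tfrac{N}{2}, \tfrac{N}{2}+1, x_e\right) \tag{10}$$

In addition, for $k = 0, 1, \ldots, \tfrac{N}{2}$, using Eqn. (1), 

$$\mathrm{DCT}\left(N-k, \tfrac{N}{2}+1, x_e\right) = \mathrm{DCT}\left(k, \tfrac{N}{2}+1, x_e\right)$$

$$\mathrm{DCT}\left(N-k, \tfrac{N}{2}+1, x_o\right) = \mathrm{DCT}\left(k, \tfrac{N}{2}+1, x_o\right) \tag{11}$$

Eqns. (9)-(11) conclude the proof of Eqn. (5). 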
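
As a small worked check of Eqn. (5), consider the shortest non-trivial case N = 2, i.e. a three-point sequence. The boundary value $x_o(N/2) = x(N-1)$ used below is the one implied by the convention $x'(N+1) = 0$ in the proof; it is stated here as an assumption of the example.

```latex
% Worked example of Eqn. (5) for N = 2, x = (x(0), x(1), x(2)).
\begin{aligned}
x_e &= \bigl(x(0),\, x(2)\bigr), \qquad x_o = \bigl(x(1),\, x(1)\bigr) \\
\mathrm{DCT}(0, 2, x_e) &= x(0) + x(2), \qquad
\mathrm{DCT}(1, 2, x_e) = x(0) - x(2), \qquad
\mathrm{DCT}(0, 2, x_o) = 2x(1) \\[4pt]
\mathrm{DCT}(0, 3, x) &= \mathrm{DCT}(0, 2, x_e) + \frac{1}{2\cos 0}\,\mathrm{DCT}(0, 2, x_o)
  = x(0) + x(1) + x(2) \\
\mathrm{DCT}(1, 3, x) &= \mathrm{DCT}(1, 2, x_e) = x(0) - x(2) \\
\mathrm{DCT}(2, 3, x) &= \mathrm{DCT}(0, 2, x_e) - \frac{1}{2\cos 0}\,\mathrm{DCT}(0, 2, x_o)
  = x(0) - x(1) + x(2)
\end{aligned}
```

These three values agree with direct evaluation of Eqn. (1), since $\cos(\pi n k / 2)$ takes only the values 1, 0 and -1 for N = 2.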

From Eqn. (5), it can be seen that the factor $\frac{1}{2\cos(\pi k / N)}$ depends only on the 
frequency-domain index k, which is quite different from other DCT-related algorithms. This allows the 
factor to be moved to the final step of the computation, and the algorithm becomes 



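$$\mathrm{DCT}(k, N+1, x) = S(k)\,\mathrm{SDCT}(k, N+1, x), \quad k = 0, 1, \ldots, N$$

$$\mathrm{SDCT}(k, N+1, x) = \cos\left(\frac{\pi k}{N}\right)\mathrm{SDCT}\left(k, \tfrac{N}{2}+1, x_e\right) + \frac{1}{2}\,\mathrm{SDCT}\left(k, \tfrac{N}{2}+1, x_o\right), \quad k = 0, 1, \ldots, \tfrac{N}{2}-1$$

$$\mathrm{SDCT}\left(\tfrac{N}{2}, N+1, x\right) = \mathrm{SDCT}\left(\tfrac{N}{2}, \tfrac{N}{2}+1, x_e\right) \tag{12}$$

$$\mathrm{SDCT}(N-k, N+1, x) = \cos\left(\frac{\pi k}{N}\right)\mathrm{SDCT}\left(k, \tfrac{N}{2}+1, x_e\right) - \frac{1}{2}\,\mathrm{SDCT}\left(k, \tfrac{N}{2}+1, x_o\right), \quad k = 0, 1, \ldots, \tfrac{N}{2}-1$$

where S(k) is called a scaling factor in the frequency domain and is obtained by 

$$S(k) = \prod_{j=1}^{M} c(k, j)$$

and 

$$c(k, j) = \begin{cases} \dfrac{1}{\left|\cos\left(\dfrac{\pi k}{N / 2^{\,j-1}}\right)\right|} & \text{if } \cos\left(\dfrac{\pi k}{N / 2^{\,j-1}}\right) \neq 0 \\[10pt] 1 & \text{otherwise} \end{cases} \tag{13}$$

Here the product runs over the M decomposition stages. The magnitude of the cosine is used because the 
recursion of Eqn. (12) folds the index k into the range 0 to N/2 - 1 at each stage, where the cosine is positive. 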
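
The recursion of Eqn. (12) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: N is a power of two, the recursion bottoms out at a two-point sequence, and the scale factors S(k) are accumulated by following the same recursion rather than by evaluating a closed-form product. The function names are hypothetical.

```python
import math


def _check_size(big_n):
    if big_n < 1 or big_n & (big_n - 1):
        raise ValueError("len(x) - 1 must be a power of two")


def sdct(x):
    """Unscaled transform: the SDCT recursion of Eqn. (12)."""
    big_n = len(x) - 1
    _check_size(big_n)
    if big_n == 1:
        # Two-point base case: no multiplication is required.
        return [x[0] + x[1], x[0] - x[1]]
    half = big_n // 2
    x_e = [x[2 * n] for n in range(half + 1)]
    x_o = [x[1]] + [x[2 * n - 1] + x[2 * n + 1] for n in range(1, half)]
    x_o += [x[big_n - 1]]
    even, odd = sdct(x_e), sdct(x_o)
    out = [0.0] * (big_n + 1)
    for k in range(half):
        c = math.cos(math.pi * k / big_n)  # always in (0, 1] for k < N/2
        out[k] = c * even[k] + 0.5 * odd[k]
        out[big_n - k] = c * even[k] - 0.5 * odd[k]
    out[half] = even[half]
    return out


def scale_factors(big_n):
    """S(k) for k = 0..N, accumulated stage by stage."""
    _check_size(big_n)
    if big_n == 1:
        return [1.0, 1.0]
    half = big_n // 2
    sub = scale_factors(half)
    s = [0.0] * (big_n + 1)
    for k in range(half):
        s[k] = s[big_n - k] = sub[k] / math.cos(math.pi * k / big_n)
    s[half] = sub[half]
    return s


def scaled_dct_i(x):
    """DCT-I of x via SDCT followed by one frequency-domain scaling pass."""
    unscaled = sdct(x)
    return [s * v for s, v in zip(scale_factors(len(x) - 1), unscaled)]
```

In this sketch the only division by a cosine occurs in `scale_factors`, which depends on N alone, so the data path in `sdct` uses only multipliers of magnitude at most one.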



[15] Thus, the use of a scale factor of the form $\frac{1}{2\cos(\cdot)}$ in a time-domain operation 
performed in an early step (i.e., decomposition) is omitted in favor of use of a scale factor of 
the form $\frac{1}{2\cos(\pi k / N)}$ in a frequency-domain operation performed in a later step (e.g., 
composition) as illustrated in Fig. 2. 

[16] With this approach, a preferred embodiment of the invention can implement the 
transform by using steps as shown in Fig. 3. 

[17] In Fig. 3, flowchart 300 is an outline of basic steps of a preferred embodiment to 
practice the invention. Flowchart 300 is entered at step 302 when it is desired to compute a 
length (N+1) DCT. Next, step 304 is executed to use Eqn. (6) to recursively compute shorter 
and shorter sequences via time-domain computation until the sequence length, Ni, becomes 2. 
Naturally, other embodiments can use sequences of any desired length. Next, step 306 is 
executed to obtain a DCT for the sequence obtained from step 304. 
[18] Step 308 is a frequency-domain computation stage in which the last three equations 
of Eqn. (12) are used to obtain an SDCT of length 
(N+1). Step 310 is executed to perform scaling in the frequency domain to obtain the desired 
length (N+1) DCT by using the first equation of Eqn. (12). In order to help optimize this 
step, scaling factors which have fixed values for a given N (e.g., obtained from Eqn. (13)) can 
be computed "offline" or prior to the execution of the bulk of transform computation. The 
routine exits at step 312. 
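
The split between the per-sample work of step 308 and the offline table used at step 310 can be illustrated as follows. The sketch is a non-authoritative illustration, assuming N is a power of two and a two-point base case; `stage_multipliers` and `scale_table` are hypothetical names. It lists every cosine multiplier a composition stage would use and builds the table of scale factors iteratively, one stage at a time.

```python
import math


def _require_power_of_two(big_n):
    if big_n < 1 or big_n & (big_n - 1):
        raise ValueError("N must be a power of two")


def stage_multipliers(big_n):
    """cos(pi k / Ni) for every stage Ni = 2, 4, ..., N and k = 0..Ni/2 - 1."""
    _require_power_of_two(big_n)
    multipliers = []
    n_i = 2
    while n_i <= big_n:
        multipliers.extend(math.cos(math.pi * k / n_i) for k in range(n_i // 2))
        n_i *= 2
    return multipliers


def scale_table(big_n):
    """Scale factors for k = 0..N, computable once per N ("offline")."""
    _require_power_of_two(big_n)
    table = [1.0, 1.0]  # two-point base case needs no scaling
    n_i = 2
    while n_i <= big_n:
        half = n_i // 2
        nxt = [0.0] * (n_i + 1)
        for k in range(half):
            nxt[k] = nxt[n_i - k] = table[k] / math.cos(math.pi * k / n_i)
        nxt[half] = table[half]
        table = nxt
        n_i *= 2
    return table
```

Every value returned by `stage_multipliers` lies in the interval (0, 1], so the composition stages could in principle use a purely fractional number format, while the larger values are confined to the table from `scale_table` and are applied once per output coefficient.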

[19] Although the invention has been described with respect to specific embodiments 
thereof, these embodiments are merely illustrative, and not restrictive of the invention. For 
example, although the invention has been discussed primarily with respect to DCTs, any 
other type of frequency-domain transform can be suitable for use with the invention. 
Although the examples discussed herein include decomposition to a 2-point DCT, benefits of 
the invention can be realized with any type of decomposition/composition algorithm and 
different sizes (e.g., 4, 8 or other) DCTs. It should be apparent that other factors rather than 
the specific scale factors discussed herein can be used in other embodiments. Other 
embodiments may perform a mix of time-domain and transform-domain scaling, since 
benefits can be realized without the need to perform all scaling in the frequency domain. 
[20] Any suitable programming language can be used to implement the routines of the 
present invention including C, C++, Java, assembly language, etc. Different programming 
techniques can be employed such as procedural or object oriented. The routines can execute 
on a single processing device or multiple processors. Although the flowchart format 
demands that the steps be presented in a specific order, this order may be changed. Multiple 
steps can be performed at the same time. The flowchart sequence can be interrupted. The 
routines can operate in an operating system environment or as stand-alone routines occupying 
all, or a substantial part, of the system processing. 

[21] Steps can be performed in hardware or software, as desired. Note that steps can be 
added to, taken from or modified from the steps in the flowcharts presented in this 
specification without deviating from the scope of the invention. In general, the flowcharts are 
only used to indicate one possible sequence of basic operations to achieve a functional aspect 
of the present invention. 

[22] In the description herein, numerous specific details are provided, such as examples of 
components and/or methods, to provide a thorough understanding of embodiments of the 
present invention. One skilled in the relevant art will recognize, however, that an 
embodiment of the invention can be practiced without one or more of the specific details, or 
with other apparatus, systems, assemblies, methods, components, materials, parts, and/or the 
like. In other instances, well-known structures, materials, or operations are not specifically 
shown or described in detail to avoid obscuring aspects of embodiments of the present 
invention. 

[23] A "computer-readable medium" for purposes of embodiments of the present invention 
may be any medium that can contain, store, communicate, propagate, or transport the 
program for use by or in connection with the instruction execution system, apparatus, system 
or device. The computer readable medium can be, by way of example only but not by 
limitation, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor 
system, apparatus, system, device, propagation medium, or computer memory. 
[24] A "processor" includes any system, mechanism or component that processes data, 
signals or other information. A processor can include a system with a general-purpose 
central processing unit, multiple processing units, dedicated circuitry for achieving 
functionality, or other systems. Processing need not be limited to a geographic location, or 
have temporal limitations. For example, a processor can perform its functions in "real time," 
"offline," in a "batch mode," etc. Portions of processing can be performed at different times 
and at different locations, by different (or the same) processing systems. 
[25] Reference throughout this specification to "one embodiment", "an embodiment", or 
"a specific embodiment" means that a particular feature, structure, or characteristic described 
in connection with the embodiment is included in at least one embodiment of the present 
invention and not necessarily in all embodiments. Thus, respective appearances of the 
phrases "in one embodiment", "in an embodiment", or "in a specific embodiment" in various 
places throughout this specification are not necessarily referring to the same embodiment. 
Furthermore, the particular features, structures, or characteristics of any specific embodiment 
of the present invention may be combined in any suitable manner with one or more other 
embodiments. It is to be understood that other variations and modifications of the 
embodiments of the present invention described and illustrated herein are possible in light of 
the teachings herein and are to be considered as part of the spirit and scope of the present 
invention. 

[26] Embodiments of the invention may be implemented by using a programmed general 
purpose digital computer, or by using application specific integrated circuits, programmable 
logic devices, or field programmable gate arrays; optical, chemical, biological, quantum or 
nanoengineered systems, components and mechanisms may also be used. In general, the 
functions of the present invention can be achieved by any means as is known in the art. 
Distributed or networked systems, components and circuits can be used. Communication, or 
transfer, of data may be wired, wireless, or by any other means. 
[27] It will also be appreciated that one or more of the elements depicted in the 
drawings/figures can also be implemented in a more separated or integrated manner, or even 
removed or rendered as inoperable in certain cases, as is useful in accordance with a 
particular application. It is also within the spirit and scope of the present invention to 
implement a program or code that can be stored in a machine-readable medium to permit a 
computer to perform any of the methods described above. 

[28] Additionally, any signal arrows in the drawings/Figures should be considered only as 
exemplary, and not limiting, unless otherwise specifically noted. Furthermore, the term "or" 
as used herein is generally intended to mean "and/or" unless otherwise indicated. 
Combinations of components or steps will also be considered as being noted, where 
terminology is foreseen as rendering the ability to separate or combine unclear. 
[29] As used in the description herein and throughout the claims that follow, "a", "an", and 
"the" include plural references unless the context clearly dictates otherwise. Also, as used 
in the description herein and throughout the claims that follow, the meaning of "in" includes 
"in" and "on" unless the context clearly dictates otherwise. 

[30] The foregoing description of illustrated embodiments of the present invention, 
including what is described in the Abstract, is not intended to be exhaustive or to limit the 
invention to the precise forms disclosed herein. While specific embodiments of, and 
examples for, the invention are described herein for illustrative purposes only, various 
equivalent modifications are possible within the spirit and scope of the present invention, as 
those skilled in the relevant art will recognize and appreciate. As indicated, these 
modifications may be made to the present invention in light of the foregoing description of 
illustrated embodiments of the present invention and are to be included within the spirit and 
scope of the present invention. 

[31] Thus, while the present invention has been described herein with reference to 
particular embodiments thereof, a latitude of modification, various changes and substitutions 
are intended in the foregoing disclosures, and it will be appreciated that in some instances 
some features of embodiments of the invention will be employed without a corresponding use 
of other features without departing from the scope and spirit of the invention as set forth. 
Therefore, many modifications may be made to adapt a particular situation or material to the 
essential scope and spirit of the present invention. It is intended that the invention not be 
limited to the particular terms used in following claims and/or to the particular embodiment 
disclosed as the best mode contemplated for carrying out this invention, but that the invention 
will include any and all embodiments and equivalents falling within the scope of the 
appended claims. 